Abstract



In recent work discussing model choice for continuous-time Markov chains, we have argued 
that it is important that the Markov matrices that define the model are closed under 
matrix multiplication [6, 7]. The primary requirement is then that the associated set of
rate matrices form a Lie algebra. For the generic case, this connection to Lie theory seems 
to have first been made by [3], with applications for specific models given in [1] and [2].
Here we take a different perspective: given a model that forms a Lie algebra, we apply 
existing Lie theory to gain additional insight into the geometry of the associated Markov 
matrices. In this short note, we present the simplest case possible of 2 x 2 Markov matrices. 
The main result is a novel decomposition of 2 x 2 Markov matrices that parameterises the 
general Markov model as a perturbation away from the binary-symmetric model. This 
alternative parameterisation provides a useful tool for visualising the binary-symmetric 
model as a submodel of the general Markov model. 
1 Results

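Consider the set of real $2 \times 2$ Markov matrices

\[
\left\{ \begin{pmatrix} 1-a & b \\ a & 1-b \end{pmatrix} : a, b \in \mathbb{R} \right\},
\]

and the subset of $2 \times 2$ "stochastic" Markov matrices

\[
\left\{ \begin{pmatrix} 1-a & b \\ a & 1-b \end{pmatrix} : 0 \le a, b \le 1 \right\}.
\]
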

In models of phylogenetic molecular evolution (see for example [1]), this set provides the 
transition matrices for what is known as the "general Markov model" on two states. If 
we were to take the additional constraint $a = b$, the model would then be referred to as
"binary-symmetric".

Associated with these sets is the matrix group 

\[
\mathcal{G} := \left\{ \begin{pmatrix} 1-a & b \\ a & 1-b \end{pmatrix} : a, b \in \mathbb{R},\ a + b \neq 1 \right\}.
\]

Figure 1: The group $\mathcal{G}$ of invertible $2 \times 2$ Markov matrices of the form $\begin{pmatrix} 1-a & b \\ a & 1-b \end{pmatrix}$, understood
geometrically as a manifold in $\mathbb{R}^2$. The gray area indicates the subset of "stochastic" Markov
matrices. The line $\det(M) = 0$ indicates the boundary of the connected component to the
identity, $\mathcal{G}^0$.



We can geometrically understand $\mathcal{G}$ by considering it as a manifold in $\mathbb{R}^2$. This is
illustrated in Figure 1.

By considering smooth paths $A(t) \in \mathcal{G}$, we can define the tangent space of this matrix
group at the identity:

\[
T_1(\mathcal{G}) = \{ A'(0) : A(t) \in \mathcal{G} \text{ and } A(0) = 1 \}.
\]

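As $\mathcal{G}$ is a matrix group, it follows that $T_1(\mathcal{G})$ forms a Lie algebra. This means that for all
$X, Y \in T_1(\mathcal{G})$ and $\lambda \in \mathbb{R}$, we have:

1. $X + \lambda Y \in T_1(\mathcal{G})$, i.e. $T_1(\mathcal{G})$ is a vector space,

2. $[X, Y] := XY - YX \in T_1(\mathcal{G})$.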

Consider two smooth functions $a(t)$ and $b(t)$ satisfying $a(t) + b(t) \neq 1$ for all $t$, and
$a(0) = b(0) = 0$. Define

\[
A(t) := \begin{pmatrix} 1-a(t) & b(t) \\ a(t) & 1-b(t) \end{pmatrix}.
\]

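Then, by construction, $A(t)$ is a smooth path in $\mathcal{G}$ and $A'(0) \in T_1(\mathcal{G})$. If we define
$L_1 := \begin{pmatrix} -1 & 0 \\ 1 & 0 \end{pmatrix}$ and $L_2 := \begin{pmatrix} 0 & 1 \\ 0 & -1 \end{pmatrix}$, we have $A'(0) = a'(0)L_1 + b'(0)L_2$, so $T_1(\mathcal{G}) = \langle L_1, L_2 \rangle_{\mathbb{R}}$
and $\{L_1, L_2\}$ is a basis for $T_1(\mathcal{G})$. It is straightforward to check that $[L_1, L_2] = L_1 - L_2$,
so we conclude that $T_1(\mathcal{G})$ is indeed a Lie algebra.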
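
The commutator relation for the basis can be checked mechanically. The following is a minimal sketch in plain Python, assuming the basis matrices $L_1 = \begin{pmatrix} -1 & 0 \\ 1 & 0 \end{pmatrix}$ and $L_2 = \begin{pmatrix} 0 & 1 \\ 0 & -1 \end{pmatrix}$; the helper names `matmul`, `sub` and `commutator` are illustrative only and do not come from the note.

```python
# 2x2 matrices are stored as nested tuples ((m11, m12), (m21, m22)).

def matmul(X, Y):
    """Ordinary 2x2 matrix product."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def sub(X, Y):
    """Entrywise difference X - Y."""
    return tuple(tuple(X[i][j] - Y[i][j] for j in range(2)) for i in range(2))

def commutator(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    return sub(matmul(X, Y), matmul(Y, X))

# Basis of the tangent space at the identity (assumed as stated above).
L1 = ((-1, 0), (1, 0))
L2 = ((0, 1), (0, -1))

print(commutator(L1, L2))  # ((-1, -1), (1, 1))
print(sub(L1, L2))         # ((-1, -1), (1, 1))
```

Both lines print the same matrix, which is the closure property $[L_1, L_2] = L_1 - L_2$ used in the text.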

Recall that a subgroup $H \le G$ of a group is normal if $ghg^{-1} \in H$ for all $h \in H$ and
$g \in G$. Also recall that the connected component to the identity $G^0$ is normal in $G$. In
our case, this becomes:

Result 1. $\mathcal{G}^0 = \{ M \in \mathcal{G} : \det(M) > 0 \}$.

Proof. Consider $M = \begin{pmatrix} 1-a & b \\ a & 1-b \end{pmatrix} = e^{Q}$, where $Q := \begin{pmatrix} -\alpha & \beta \\ \alpha & -\beta \end{pmatrix}$ is a rate matrix (as would
occur in a continuous-time formulation of a Markov process). Using the power series
expansion of $e^{A}$ together with $Q^2 = -(\alpha + \beta)Q$, it is straightforward to show that, if $a + b < 1$,

\[
\alpha = \frac{-\log(1-(a+b))}{1 + b/a}, \qquad \beta = \frac{-\log(1-(a+b))}{1 + a/b}
\]

provides a solution to $M = e^{Q}$. If we define the path $A(t) := e^{Qt}$, we have $A(0) = 1$ and
$A(1) = M$. Thus, $M \in \mathcal{G}^0$ for all $a + b < 1$. On the other hand, if $a + b > 1$, there can be
no path $B(t) \in \mathcal{G}$ with $B(0) = 1$ and $B(1) = M$ because, by continuity of the determinant, we would have $\det(B(\tau)) = 0$
for some $\tau$ in the interval $(0, 1)$. □
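
The construction in this proof can be illustrated numerically. The sketch below assumes the reading $M = e^{Q}$ with $Q = \begin{pmatrix} -\alpha & \beta \\ \alpha & -\beta \end{pmatrix}$, and uses the closed form $e^{Qt} = 1 + Q\,(1 - e^{-(\alpha+\beta)t})/(\alpha+\beta)$, which follows from $Q^2 = -(\alpha+\beta)Q$. The function names `rate_params` and `exp_rate` are hypothetical, and the code additionally assumes $a + b \neq 0$ so that the divisions are defined.

```python
import math

def rate_params(a, b):
    """Return (alpha, beta) with exp(Q) = M, assuming a + b < 1 and a + b != 0.

    Uses alpha + beta = -log(1 - (a + b)) and alpha / beta = a / b, written
    as alpha = total * a / (a + b) to avoid dividing by a or b individually.
    """
    total = -math.log(1.0 - (a + b))
    return total * a / (a + b), total * b / (a + b)

def exp_rate(alpha, beta, t=1.0):
    """Closed form of exp(Q t) for Q = [[-alpha, beta], [alpha, -beta]]."""
    r = alpha + beta
    f = (1.0 - math.exp(-r * t)) / r
    return ((1.0 - alpha * f, beta * f), (alpha * f, 1.0 - beta * f))

def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

alpha, beta = rate_params(0.2, 0.1)
# The path A(t) = exp(Q t) runs from the identity (t = 0) to M (t = 1),
# and its determinant exp(-(alpha + beta) t) stays positive throughout.
for t in (0.0, 0.5, 1.0):
    A = exp_rate(alpha, beta, t)
    print(t, A, det(A))
```

At $t = 1$ the printed matrix agrees with $a = 0.2$, $b = 0.1$ up to floating-point rounding.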

Corollary 1. $\mathcal{G}^0 = \left\{ e^{Q} : Q = \begin{pmatrix} -\alpha & \beta \\ \alpha & -\beta \end{pmatrix};\ \alpha, \beta \in \mathbb{R} \right\}$.

Recall the homomorphism theorem for groups (see for example [5]), which ensures, for
any group homomorphism $\rho : G \to G'$, that (i.) the image of $\rho$ is a subgroup of $G'$, (ii.)
the kernel $K$ of $\rho$ is normal in $G$, and (iii.) $G/K \cong \rho(G)$. To understand the set difference
$\mathcal{G} - \mathcal{G}^0$, we consider the homomorphism

\[
\mathcal{G} \to \{1, -1\} \cong \mathbb{Z}_2, \qquad M \mapsto \operatorname{sgn}(\det(M)).
\]

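The kernel of this homomorphism is $\mathcal{G}^0$, thus $\mathcal{G}/\mathcal{G}^0 = \{\mathcal{G}^0, P\mathcal{G}^0\} \cong \mathbb{Z}_2$ for some $P \in \mathcal{G} - \mathcal{G}^0$.
For reasons of symmetry, we reflect the identity $1$ in the line $\det(M) = 0$ and set $P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$,
noting that $P^2 = 1$. As $\mathcal{G}/\mathcal{G}^0$ is a partition of $\mathcal{G}$, we see that $\mathcal{G} - \mathcal{G}^0 = P\mathcal{G}^0$ and

\[
\mathcal{G} = \mathcal{G}^0 \cup P\mathcal{G}^0.
\]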
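
As a small illustration of this two-coset structure, the sketch below classifies a matrix by the sign of its determinant and shows that left multiplication by $P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ moves it to the other coset. The names `markov`, `component` and the sample values $a = 0.9$, $b = 0.6$ are assumptions made purely for the example.

```python
def markov(a, b):
    """Markov matrix with columns summing to one, in the form used in the text."""
    return ((1 - a, b), (a, 1 - b))

def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def matmul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

P = ((0, 1), (1, 0))

def component(M):
    """Return sgn(det(M)): +1 on the identity component, -1 on its coset."""
    d = det(M)
    if d == 0:
        raise ValueError("singular matrix: not an element of the group")
    return 1 if d > 0 else -1

M = markov(0.9, 0.6)              # a + b > 1, so det(M) < 0
print(component(M))               # -1
print(component(matmul(P, M)))    # 1
print(matmul(P, P))               # ((1, 0), (0, 1))
```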

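Somewhat trivially:

Result 2. As manifolds, $\mathcal{G}^0 \cong P\mathcal{G}^0$.

Proof. Clearly,

\[
\rho : \mathcal{G}^0 \to P\mathcal{G}^0, \qquad M \mapsto PM,
\]

is a diffeomorphism: it is the restriction of an invertible linear map (its inverse is again $M \mapsto PM$, since $P^2 = 1$), and so it maps continuous paths to continuous paths. □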

In particular, this means that:

Result 3. $\mathcal{G}^0$ is connected $\iff$ $P\mathcal{G}^0$ is connected.

Proof. No proof is required, but we give one regardless to illustrate. Consider the path
$A(t) = e^{Q_2 t} e^{Q_1 (1-t)} \in \mathcal{G}^0$ with $A(0) = M_1 := e^{Q_1}$ and $A(1) = M_2 := e^{Q_2}$. Now, $B(t) :=
PA(t)$ is a path in $P\mathcal{G}^0$ with $B(0) = PM_1$ and $B(1) = PM_2$. As any two points in $P\mathcal{G}^0$
can be written in this way, we are done. □

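Recall that the center $Z(G)$ of a group $G$ is the set of all $g \in G$ such that $gh = hg$ for
all $h \in G$. In our case, suppose that $N = \begin{pmatrix} 1-c & d \\ c & 1-d \end{pmatrix} \in Z(\mathcal{G})$. Setting $NM = MN$ implies:

\[
\begin{pmatrix} 1-c & d \\ c & 1-d \end{pmatrix} \begin{pmatrix} 1-a & b \\ a & 1-b \end{pmatrix} = \begin{pmatrix} * & b(1-c-d)+d \\ a(1-d-c)+c & * \end{pmatrix}
\]

\[
= \begin{pmatrix} 1-a & b \\ a & 1-b \end{pmatrix} \begin{pmatrix} 1-c & d \\ c & 1-d \end{pmatrix} = \begin{pmatrix} * & d(1-b-a)+b \\ c(1-b-a)+a & * \end{pmatrix},
\]

which is true if and only if $-bc = -ad$ for all $a$ and $b$. This can only happen if $c = d = 0$,
thus $Z(\mathcal{G}) = \{1\}$. Now, consider the basic theorem (see for example [5]):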

Theorem 1.1. If a matrix group $G$ is path connected with discrete center, then any non-discrete
normal subgroup $H$ will have tangent space $T_1(H) \neq \{0\}$. Further, $T_1(H)$ is an
ideal of $T_1(G)$, i.e. $[X, Y] \in T_1(H)$ for all $X \in T_1(H)$ and $Y \in T_1(G)$. Therefore, any
such $H$ can be detected by checking for ideals of $T_1(G)$.

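In our case, $\mathcal{G}^0$ satisfies the conditions of this theorem. Suppose $\mathcal{I}$ is a non-zero proper ideal of
$T_1(\mathcal{G}^0)$. Then $\mathcal{I}$ is one-dimensional, and $Y := xL_1 + yL_2 \in \mathcal{I}$ satisfies:

\[
[Y, L_1] = y(L_2 - L_1) \in \mathcal{I}, \quad \text{and} \quad [Y, L_2] = x(L_1 - L_2) \in \mathcal{I},
\]

which can only be true if $Y \propto (L_1 - L_2)$.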
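
The ideal condition can be tested directly for particular coefficients. The sketch below assumes the basis $L_1 = \begin{pmatrix} -1 & 0 \\ 1 & 0 \end{pmatrix}$, $L_2 = \begin{pmatrix} 0 & 1 \\ 0 & -1 \end{pmatrix}$ and checks whether the line spanned by $Y = xL_1 + yL_2$ is closed under brackets with both basis elements. The names `combo`, `coords` and `spans_ideal` are hypothetical helpers introduced only for this example.

```python
def matmul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return tuple(tuple(XY[i][j] - YX[i][j] for j in range(2)) for i in range(2))

L1 = ((-1, 0), (1, 0))
L2 = ((0, 1), (0, -1))

def combo(x, y):
    """The matrix x*L1 + y*L2."""
    return ((-x, y), (x, -y))

def coords(X):
    """Coordinates (x, y) of X = x*L1 + y*L2 (X is assumed to lie in the algebra)."""
    return X[1][0], X[0][1]

def spans_ideal(x, y):
    """True if span{x*L1 + y*L2} is closed under brackets with L1 and L2."""
    Y = combo(x, y)
    for L in (L1, L2):
        bx, by = coords(bracket(Y, L))
        if bx * y - by * x != 0:  # [Y, L] is not proportional to Y
            return False
    return True

print(spans_ideal(1, -1))  # True:  Y = L1 - L2
print(spans_ideal(1, 0))   # False: Y = L1
print(spans_ideal(1, 1))   # False: Y = L1 + L2
```

With integer coefficients the arithmetic is exact, so the proportionality test needs no tolerance.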

Figure 2: Lie geometry of $2 \times 2$ Markov matrices. (Labelled in the figure: the line $\det(M) = 0$, and the subgroup $\mathcal{H}$, i.e. $\det(M) = 1$.)



Result 4. $\langle Y \rangle_{\mathbb{R}} = \langle L_1 - L_2 \rangle_{\mathbb{R}}$ is the only non-zero proper ideal of $T_1(\mathcal{G}^0)$.

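We take $Y = L_1 - L_2$ and note that $Y^2 = 0$, so $e^{Ys} = e^{(L_1 - L_2)s} = 1 + Ys = \begin{pmatrix} 1-s & -s \\ s & 1+s \end{pmatrix} =: h_s$.
If we define the matrix group $\mathcal{H} := \left\{ \begin{pmatrix} 1-s & -s \\ s & 1+s \end{pmatrix} : s \in \mathbb{R} \right\}$, it is easy to confirm that $\mathcal{H}$
is normal in $\mathcal{G}^0$ and has tangent space $T_1(\mathcal{H}) = \langle Y \rangle_{\mathbb{R}}$.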

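Let $\mathbb{R}^{\times}_{>0}$ be the set of positive real numbers considered as a group under multiplication.
We have:

Result 5. $\mathcal{H}$ is the kernel of the homomorphism $\mathcal{G}^0 \to \mathbb{R}^{\times}_{>0}$ defined by $M \mapsto \det(M)$.
Thus $\mathcal{G}^0/\mathcal{H} \cong \mathbb{R}^{\times}_{>0}$.

Proof.

\[
\det(M) = 1 \iff a + b = 0 \iff M = \begin{pmatrix} 1-a & -a \\ a & 1+a \end{pmatrix}.
\]

□
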



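Since $h_s h_t = h_{s+t}$, i.e. $\mathcal{H}$ forms a one-parameter subgroup of $\mathcal{G}^0$, we have $\mathcal{H} \cong \mathbb{R}^{+}$,
where $\mathbb{R}^{+} = \mathbb{R}$ is considered as a group under addition. Note that $\mathcal{G}^0/\mathcal{H}$ is a parameterised
partition of $\mathcal{G}^0$, so we can write $\mathcal{G}^0 = \bigcup_{t \in \mathbb{R}} e^{Qt}\mathcal{H}$, where $Q \in T_1(\mathcal{G}^0) - T_1(\mathcal{H})$. We then
see that any $M \in \mathcal{G}^0$ can be written as a product $e^{Qt}h_s$, where $\det(M) = \det(e^{Qt})$.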
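
The one-parameter group law and the normality of the subgroup of matrices $h_s = \begin{pmatrix} 1-s & -s \\ s & 1+s \end{pmatrix}$ can be checked with exact rational arithmetic. This is a minimal sketch; the helper names and the sample values are assumptions, and the identity $g h_s g^{-1} = h_{\det(g)\,s}$ used in the last check is a short side calculation (it follows from $gYg^{-1} = \det(g)\,Y$ for $Y = L_1 - L_2$) rather than a statement taken from the note.

```python
from fractions import Fraction as F

def matmul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inverse(M):
    """Inverse of an invertible 2x2 matrix via the adjugate."""
    d = det(M)
    return ((M[1][1] / d, -M[0][1] / d), (-M[1][0] / d, M[0][0] / d))

def h(s):
    """Element h_s = 1 + s*(L1 - L2) of the determinant-one subgroup."""
    return ((1 - s, -s), (s, 1 + s))

s, t = F(3), F(-5, 2)
print(matmul(h(s), h(t)) == h(s + t))   # True: h_s h_t = h_{s+t}

# Conjugating by an element g of the identity component stays in the subgroup.
a, b = F(1, 5), F(1, 10)
g = ((1 - a, b), (a, 1 - b))
conj = matmul(matmul(g, h(s)), inverse(g))
print(det(conj) == 1)                   # True
print(conj == h(det(g) * s))            # True
```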



Again for reasons of symmetry we take $Q = \frac{1}{2}\begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}$, i.e. $Q$ is the generator of the
binary-symmetric model, and we have $\det(e^{Qt}) = e^{-t} =: \lambda$.
This brings us to our main result.




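Result 6. Any $M \in \mathcal{G}^0$ can be expressed as

\[
M = \begin{pmatrix} 1-a & b \\ a & 1-b \end{pmatrix} = e^{Qt} h_s
= \frac{1}{2}\begin{pmatrix} 1+e^{-t} & 1-e^{-t} \\ 1-e^{-t} & 1+e^{-t} \end{pmatrix} \begin{pmatrix} 1-s & -s \\ s & 1+s \end{pmatrix}
= \frac{1}{2}\begin{pmatrix} 1+\lambda & 1-\lambda \\ 1-\lambda & 1+\lambda \end{pmatrix} \begin{pmatrix} 1-s & -s \\ s & 1+s \end{pmatrix}, \tag{1}
\]

where $\det(M) = \lambda = 1 - a - b$, and $s = \frac{1}{2}(a - b)\det(M)^{-1}$.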
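
A round trip through this decomposition can be sketched as follows, assuming the reading $\lambda = e^{-t} = 1 - a - b$ and $s = \frac{1}{2}(a - b)/\lambda$. The function names `decompose` and `compose` are hypothetical and chosen only for this illustration.

```python
import math

def matmul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def decompose(a, b):
    """Return (t, lam, s) for M = [[1-a, b], [a, 1-b]] with det(M) > 0."""
    lam = 1.0 - a - b                 # det(M)
    if lam <= 0:
        raise ValueError("M is not in the identity component (det <= 0)")
    t = -math.log(lam)                # lam = exp(-t)
    s = 0.5 * (a - b) / lam           # perturbation away from a = b
    return t, lam, s

def compose(t, s):
    """Rebuild M as the product of the binary-symmetric factor and h_s."""
    lam = math.exp(-t)
    sym = ((0.5 * (1 + lam), 0.5 * (1 - lam)),
           (0.5 * (1 - lam), 0.5 * (1 + lam)))
    h_s = ((1 - s, -s), (s, 1 + s))
    return matmul(sym, h_s)

t, lam, s = decompose(0.2, 0.1)
print(t, lam, s)
print(compose(t, s))   # entries agree with a = 0.2, b = 0.1 up to rounding
```

When $a = b$ the function returns $s = 0$, which recovers the binary-symmetric model as the case of no perturbation.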



For the binary-symmetric model implemented as a stationary Markov chain, the parameter
$t = -\log\lambda$ is proportional to the expected number of transitions in the chain.
Therefore we can think of the parameter $s$ as providing a perturbation away from the
binary-symmetric model. Additionally, to ensure that $M$ is a stochastic Markov matrix,
with $a, b \ge 0$, we require $-\frac{1}{2}(e^{t} - 1) \le s \le \frac{1}{2}(e^{t} - 1)$.
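
The bound on $s$ can be seen from the entries of the product in (1): multiplying out gives $a = \frac{1}{2}(1-\lambda) + \lambda s$ and $b = \frac{1}{2}(1-\lambda) - \lambda s$ with $\lambda = e^{-t}$. The sketch below evaluates these expressions at and beyond the bound; the function names `s_bounds` and `entries` are hypothetical, and $t \ge 0$ is assumed so that the admissible interval is non-empty.

```python
import math

def s_bounds(t):
    """Interval of s for which the decomposition gives a, b >= 0 (t >= 0 assumed)."""
    half = 0.5 * (math.exp(t) - 1.0)
    return -half, half

def entries(t, s):
    """Off-diagonal entries (a, b) of the product in decomposition (1)."""
    lam = math.exp(-t)
    a = 0.5 * (1.0 - lam) + lam * s
    b = 0.5 * (1.0 - lam) - lam * s
    return a, b

t = 1.0
lo, hi = s_bounds(t)
print(entries(t, 0.0))        # a == b: the binary-symmetric case
print(entries(t, hi))         # b is (numerically) zero at the upper bound
print(entries(t, hi + 0.1))   # b < 0: no longer a stochastic matrix
```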

The decomposition (1) is the main result of this note and is presented geometrically
in Figure 2. It is remarkable that such a simple application of elementary Lie theory has
led directly to this decomposition, and it seems plausible that this decomposition may be
useful in practice for (i.) computational efficiency, and/or (ii.) the simple interpretation
of the parameters $t$, $\lambda$ and $s$. It will be interesting to explore whether a similar analysis
leads to alternative parameterisations for other popular phylogenetic models that form Lie
algebras, but we leave this for future work.